Feat(e2e): support multiple aggregators in the e2e tests #2378
base: main
LGTM
Just some remarks
@@ -940,7 +940,7 @@ mod tests {
     runner
         .expect_inform_new_epoch()
         .with(predicate::eq(new_time_point_clone.clone().epoch))
-        .once()
+        .times(2)
Why do we need to change the number of calls?
The modification to the state_machine seems to concern only the slave mode.
Does it mean we are running a slave?
The test name says that it is a master: "idle_new_epoch_detected_and_master_has_transitioned_to_epoch"
))),
}
}
SignerRelayMode::Passthrough => {
There is no test in this file covering this code.
Are tests unnecessary here?
Is it tested elsewhere?
mithril-test-lab/mithril-end-to-end/src/mithril/infrastructure.rs (outdated, resolved)
LGTM
// This should be removed when the aggregator is able to synchronize its certificate chain from another aggregator
if !aggregator.is_first() {
    tokio::time::sleep(std::time::Duration::from_millis(
        5 * aggregator.mithril_run_interval() as u64,
Maybe extracting the hardcoded value 5 into a named constant would improve readability and maintainability?
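A minimal sketch of what this suggestion could look like. The constant name, the helper function, and the `u32` type of the run interval are assumptions made for illustration; only the factor 5 and the millisecond unit come from the snippet under review.

```rust
use std::time::Duration;

/// Hypothetical name for the literal `5` in the reviewed snippet: the number
/// of run intervals to wait before starting an aggregator that is not the first one.
const NON_FIRST_AGGREGATOR_STARTUP_DELAY_RUN_INTERVALS: u64 = 5;

/// Computes the startup delay from the aggregator run interval, in milliseconds.
/// The `u32` parameter type is an assumption; the original code casts with `as u64`.
fn non_first_aggregator_startup_delay(mithril_run_interval_ms: u32) -> Duration {
    Duration::from_millis(
        NON_FIRST_AGGREGATOR_STARTUP_DELAY_RUN_INTERVALS * u64::from(mithril_run_interval_ms),
    )
}
```

With such a helper, the call site would reduce to a single `tokio::time::sleep(non_first_aggregator_startup_delay(...))`, and the factor would be documented in one place.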
LGTM with a few caveats.
self.runner.update_epoch_settings().await?;
if self.config.is_slave {
    self.runner
        .synchronize_slave_aggregator_signer_registration()
        .await?;
    // Needed to recompute epoch data for the next signing round on the slave
    self.runner.inform_new_epoch(new_time_point.epoch).await?;
}
Can you explain how this change helps to stabilize the e2e tests? I'm quite puzzled by the fact that we need to call runner.inform_new_epoch twice.
From what I understand, this doesn't impact the methods called between the two inform_new_epoch calls:
- the runner.upkeep call should not be impacted
- open_signer_registration_round does nothing on a slave
- update_epoch_settings should not be impacted, as the data registered by the epoch service (protocol parameters and transactions signing config) doesn't depend on the master aggregator
The functional impacts should be:
- the epoch service will expose an incorrect list of next_signers in the interval between the two inform_new_epoch calls
- the epoch service will be ready earlier, since a first inform_epoch call will be done without needing a roundtrip to the master aggregator
Is the last point the problem on fast networks? Maybe the synchronizer should be able to "edit" the next signers in the epoch_service instead?
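To make the call sequence discussed in this comment easier to follow, here is a simplified, synchronous model of the transition. It only records call names: `RecordingRunner`, `transition_to_new_epoch`, and `count_calls` are hypothetical helpers, and the relative order of the calls before the `is_slave` branch is an assumption drawn from the comment, not from the actual state machine (which is async and fallible).

```rust
/// Records the names of the runner calls made during one epoch transition.
/// The method names mirror the reviewed snippet; nothing is actually executed.
#[derive(Default)]
struct RecordingRunner {
    calls: Vec<&'static str>,
}

impl RecordingRunner {
    fn record(&mut self, name: &'static str) {
        self.calls.push(name);
    }
}

/// Simplified model of the transition under discussion.
/// Assumption: a first `inform_new_epoch` precedes `upkeep`,
/// `open_signer_registration_round` and `update_epoch_settings`.
fn transition_to_new_epoch(runner: &mut RecordingRunner, is_slave: bool) {
    runner.record("inform_new_epoch");
    runner.record("upkeep");
    runner.record("open_signer_registration_round");
    runner.record("update_epoch_settings");
    if is_slave {
        runner.record("synchronize_slave_aggregator_signer_registration");
        // Second call: recompute epoch data once signers are synchronized.
        runner.record("inform_new_epoch");
    }
}

/// Counts how many times a given call was recorded.
fn count_calls(runner: &RecordingRunner, name: &str) -> usize {
    runner.calls.iter().copied().filter(|call| *call == name).count()
}
```

In this model, a slave records `inform_new_epoch` twice and a master once, which is the difference the `.times(2)` expectation in the state machine test has to account for.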
mithril-test-lab/mithril-end-to-end/src/mithril/infrastructure.rs (two further outdated, resolved threads)
First aggregator is 'master', and others (if any) are 'slave' to the 'master'.
…ggregators Better P2P relays topology and fix log files collisions.
By providing information about the targeted aggregator in logs and errors.
Which could prevent signature from signers even with loose protocol parameters.
Which can be 'Passthrough' or 'P2P'.
As master/slave signer registration is only one of the configurations to be tested.
Until we can fix the source of flakiness.
- Removed last epoch which was not necessary
- Removed unnecessary cycles
- Reduced the number of signers per epoch
- Use of 'checked_sub' in the 'EpochFixturesMapBuilder'.
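For context on the last item, `checked_sub` on unsigned integers returns `None` instead of panicking (in debug builds) or wrapping (in release builds) when the subtraction would underflow. The sketch below models epochs as plain `u64` values; `epoch_before` is a hypothetical helper, and how 'EpochFixturesMapBuilder' actually uses the call is not shown in this PR page.

```rust
/// Returns the epoch that lies `offset` epochs before `epoch`,
/// or `None` when that would go below epoch 0.
/// Epochs are modelled as plain `u64` values for illustration.
fn epoch_before(epoch: u64, offset: u64) -> Option<u64> {
    epoch.checked_sub(offset)
}
```

A caller can then decide explicitly what to do near epoch 0, for example skipping the fixture for that epoch rather than crashing.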
Content
This PR includes the adaptation of the e2e tests to support multiple aggregators:
- Passthrough (messages are sent to the configured aggregator endpoint) and P2P (messages are sent to the P2P network) modes for both the signer registration and signature registration.
- The configuration options have been updated accordingly:
  - number_of_aggregators and number_of_signers are specified instead of number_of_pool_nodes
  - use_p2pmode has been replaced by the more appropriate use_relays
  - relay_signer_registration_mode and relay_signature_registration_mode have been added (used with the use_relays option)
- RunOnly mode of the e2e test has been adapted to support multiple aggregators concurrently
- Spec mode of the e2e test has been adapted to support multiple aggregators concurrently
Pre-submit checklist
Issue(s)
Closes #2361
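As an illustration of the configuration options listed under Content, the sketch below groups them in one struct. The type names `RelayMode` and `E2eOptions`, the field types, and the `effective_relay_modes` helper are assumptions; only the option names, the two modes, and the fact that the relay modes are used together with `use_relays` come from the description.

```rust
/// The two relay modes named in the description (hypothetical enum name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RelayMode {
    /// Messages are sent to the configured aggregator endpoint.
    Passthrough,
    /// Messages are sent to the P2P network.
    P2P,
}

/// Hypothetical grouping of the e2e options; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
struct E2eOptions {
    number_of_aggregators: u8,
    number_of_signers: u8,
    use_relays: bool,
    relay_signer_registration_mode: RelayMode,
    relay_signature_registration_mode: RelayMode,
}

impl E2eOptions {
    /// Returns (signer registration mode, signature registration mode),
    /// or `None` when relays are disabled, since the modes only apply
    /// together with `use_relays`.
    fn effective_relay_modes(&self) -> Option<(RelayMode, RelayMode)> {
        if self.use_relays {
            Some((
                self.relay_signer_registration_mode,
                self.relay_signature_registration_mode,
            ))
        } else {
            None
        }
    }
}
```

Keeping the two modes as separate fields reflects the description, where signer registration and signature registration can each be relayed in either mode.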